A Phonological Phrase Sequence Modelling Approach for Resource Efficient and Robust Real-Time Punctuation Recovery

Authors

  • Anna Moró
  • György Szaszák
Abstract

For the automatic punctuation of Automatic Speech Recognition (ASR) output, both prosodic and text-based features are used, often in combination. Purely prosody-based approaches usually have low computational needs, introduce little latency (delay), and are more robust to ASR errors. Text-based approaches usually yield better performance; they are, however, resource demanding (in both training and computation), often introduce high latency, and are more sensitive to ASR errors. The present paper proposes a lightweight prosody-based punctuation approach following a new paradigm: we argue in favour of an all-inclusive modelling of speech prosody instead of relying only on distinct acoustic markers. First, the entire phonological phrase structure is reconstructed; then its close correlation with punctuation is exploited in a sequence modelling approach with recurrent neural networks. With this tiny and easy-to-implement model, we reach performance in Hungarian punctuation comparable to that of large, text-based models for other languages, while keeping resource requirements minimal and suitable for real-time operation with low latency.

Similar resources

A robust engineering approach for wind turbine blade profile aeroelastic computation

Wind turbines are important devices that extract clean energy from wind flow. The efficiency of wind turbines should be examined under various working conditions in order to estimate off-design performance. Numerous aerodynamic and structural research works have been carried out to compute aeroelastic effects on wind turbines. Most of them suffer from either the simplicity of the modelling ...

ROBUST RESOURCE-CONSTRAINED PROJECT SCHEDULING WITH UNCERTAIN-BUT-BOUNDED ACTIVITY DURATIONS AND CASH FLOWS I. A NEW SAMPLING-BASED HYBRID PRIMARY-SECONDARY CRITERIA APPROACH

This paper presents a new primary-secondary-criteria scheduling model for the resource-constrained project scheduling problem (RCPSP) with uncertain activity durations (UD) and cash flows (UC). The RCPSP-UD-UC approach producing a “robust” resource-feasible schedule immunized against uncertainties in the activity durations and which is on the sampling-based scenarios may be evaluated from a cos...

A CRF Sequence Labeling Approach to Chinese Punctuation Prediction

This paper presents a conditional random fields based labeling approach to Chinese punctuation prediction. To this end, we first reformulate Chinese punctuation prediction as a multiple-pass labeling task on a sequence of words, and then explore various features from three linguistic levels, namely words, phrase and functional chunks for punctuation prediction under the framework of conditional...

Better Punctuation Prediction with Hierarchical Phrase-Based Translation

Punctuation prediction is an important task in spoken language translation and can be performed by using a monolingual phrase-based translation system to translate from unpunctuated text to text with punctuation. However, a punctuation prediction system based on phrase-based translation is not able to capture long-range dependencies between words and punctuation marks. In this paper, we propose to e...

Publication date: 2017